Selective Sampling for Nearest Neighbor Classifiers

Authors

  • Michael Lindenbaum
  • Shaul Markovitch
  • Dmitry Rusakov
Abstract

In the traditional, passive approach to learning, the information available to the learner is a set of classified examples drawn at random from the instance space. In many applications, however, the initial classification of the training set is a costly process, so an active learner instead selects training examples intelligently from unlabeled data. This paper proposes a lookahead algorithm for example selection and addresses the problem of active learning in the context of nearest neighbor classifiers. The proposed approach relies on a random field model of the example labeling, which implies that the label estimates change dynamically during the sampling process. The proposed selective sampling algorithm was evaluated empirically on artificial and real data sets. The experiments show that the proposed method outperforms other methods in most cases.
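To make the selection mechanism concrete, the following is a minimal Python sketch of one-step lookahead example selection for a 1-NN learner. It is a simplified stand-in, not the paper's algorithm: a distance-decay vote (label_prob) replaces the full random field model of label probabilities, utility is taken to be the mean certainty of the label estimates over the pool, and all function names and the beta parameter are illustrative.

    import numpy as np

    def label_prob(x, X_lab, y_lab, beta=1.0):
        # Probability that x has label 1 under a simple distance-decay
        # vote (a crude stand-in for the paper's random field model).
        d = np.linalg.norm(X_lab - x, axis=1)
        w = np.exp(-beta * d)
        return np.sum(w * (y_lab == 1)) / np.sum(w)

    def expected_utility(X_pool, X_lab, y_lab, i, beta=1.0):
        # Expected utility of querying pool point i: average, over its
        # two possible labels, of how certain the label estimates become.
        p1 = label_prob(X_pool[i], X_lab, y_lab, beta)
        u = 0.0
        for y_i, p in ((1, p1), (0, 1.0 - p1)):
            X2 = np.vstack([X_lab, X_pool[i]])
            y2 = np.append(y_lab, y_i)
            probs = np.array([label_prob(x, X2, y2, beta) for x in X_pool])
            u += p * np.mean(np.maximum(probs, 1.0 - probs))
        return u

    def select_next(X_pool, X_lab, y_lab, beta=1.0):
        # One-step lookahead: query the example with highest expected utility.
        scores = [expected_utility(X_pool, X_lab, y_lab, i, beta)
                  for i in range(len(X_pool))]
        return int(np.argmax(scores))

Note that the label estimates returned by label_prob change every time a new example is added, which is the dynamic relabeling behavior the abstract attributes to the random field model.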


Similar Papers

Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets

Combining multiple classifiers is an effective technique for improving accuracy. There are many general combining algorithms, such as Bagging or Error Correcting Output Coding, that significantly improve classifiers like decision trees, rule learners, or neural networks. Unfortunately, many combining methods do not improve the nearest neighbor classifier. In this paper, we present MFS, a combining a...
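As a rough illustration of the MFS scheme described above, here is a Python sketch: each ensemble member is a 1-NN classifier restricted to a random feature subset, and the members' predictions are combined by majority vote. The subset size, ensemble size, and sampling with replacement are illustrative assumptions rather than the paper's exact settings, and integer class labels starting at 0 are assumed.

    import numpy as np

    rng = np.random.default_rng(0)

    def nn_predict(X_train, y_train, X_test, feats):
        # 1-NN prediction using only the selected feature subset.
        preds = [y_train[np.argmin(np.linalg.norm(X_train[:, feats] - x[feats],
                                                  axis=1))]
                 for x in X_test]
        return np.array(preds)

    def mfs_predict(X_train, y_train, X_test, n_classifiers=10, subset_size=None):
        n_feats = X_train.shape[1]
        subset_size = subset_size or n_feats // 2
        votes = []
        for _ in range(n_classifiers):
            # Each member sees a random feature subset (here: with replacement).
            feats = rng.choice(n_feats, size=subset_size, replace=True)
            votes.append(nn_predict(X_train, y_train, X_test, feats))
        votes = np.stack(votes)
        # Majority vote over the ensemble, per test instance.
        return np.array([np.bincount(col).argmax() for col in votes.T])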


Effective supra-classifiers for knowledge base construction

We explore the use of the supra-classifier framework in the construction of a classifier knowledge base. Previously, we introduced this framework, within which labels produced by old classifiers are used to improve the generalization performance of a new classifier for a different but related classification task (Bollacker and Ghosh, 1998). We showed empirically that a simple Hamming nearest neighbor ...
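Based on the description above, a minimal sketch of a Hamming nearest neighbor supra-classifier might look as follows. It assumes each old classifier is a callable returning a discrete label; an instance is then represented by the vector of those labels, and the new task is solved by 1-NN under Hamming distance in that label space. The interface is hypothetical, not the authors' code.

    import numpy as np

    def label_vector(x, old_classifiers):
        # Represent an instance by the labels the old classifiers assign to it.
        return np.array([clf(x) for clf in old_classifiers])

    def hamming_nn(x, X_train, y_train, old_classifiers):
        # Classify x for the new task by the Hamming-nearest label vector.
        v = label_vector(x, old_classifiers)
        V = np.array([label_vector(xt, old_classifiers) for xt in X_train])
        d = (V != v).sum(axis=1)      # Hamming distance in label space
        return y_train[np.argmin(d)]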


Nearest Neighbor Classification from Multiple Feature Subsets

Combining multiple classifiers is an effective technique for improving accuracy. There are many general combining algorithms, such as Bagging, Boosting, or Error Correcting Output Coding, that significantly improve classifiers like decision trees, rule learners, or neural networks. Unfortunately, these combining methods do not improve the nearest neighbor classifier. In this paper, we present MFS, a...


Nearest neighbor classifier: Simultaneous editing and feature selection

Nearest neighbor classifiers demand significant computational resources (time and memory). Editing of the reference set and feature selection are two different approaches to this problem. Here we encode the two approaches within the same genetic algorithm (GA) and simultaneously select features and reference cases. Two data sets were used: the SATIMAGE data and a generated data set. The GA was fou...
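A minimal sketch of the joint encoding described above: one GA chromosome carries a bit per reference case plus a bit per feature, and fitness is the accuracy of a 1-NN classifier that uses only the kept cases and features. The validation-set fitness, population size, and genetic operators below are illustrative assumptions, not the paper's exact GA.

    import numpy as np

    rng = np.random.default_rng(1)

    def fitness(mask, X, y, X_val, y_val):
        # First len(X) bits keep reference cases, remaining bits keep features.
        cases, feats = mask[:len(X)], mask[len(X):]
        if cases.sum() == 0 or feats.sum() == 0:
            return 0.0
        Xr, yr = X[cases][:, feats], y[cases]
        preds = [yr[np.argmin(np.linalg.norm(Xr - xv[feats], axis=1))]
                 for xv in X_val]
        return float(np.mean(np.array(preds) == y_val))

    def ga_select(X, y, X_val, y_val, pop=30, gens=50, p_mut=0.01):
        n_bits = len(X) + X.shape[1]
        P = rng.random((pop, n_bits)) < 0.5          # random initial population
        for _ in range(gens):
            f = np.array([fitness(m, X, y, X_val, y_val) for m in P])
            new = [P[np.argmax(f)].copy()]           # keep the best (elitism)
            while len(new) < pop:
                a, b = rng.integers(pop, size=2)
                p1 = P[a] if f[a] >= f[b] else P[b]  # tournament selection
                a, b = rng.integers(pop, size=2)
                p2 = P[a] if f[a] >= f[b] else P[b]
                cross = rng.random(n_bits) < 0.5     # uniform crossover
                child = np.where(cross, p1, p2)
                child ^= rng.random(n_bits) < p_mut  # bit-flip mutation
                new.append(child)
            P = np.array(new)
        f = np.array([fitness(m, X, y, X_val, y_val) for m in P])
        return P[np.argmax(f)]                       # best case/feature mask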


Cloud Classification Using Error-Correcting Output Codes

Novel artificial intelligence methods are used to classify 16x16 pixel regions (obtained from Advanced Very High Resolution Radiometer (AVHRR) images) in terms of cloud type (e.g., stratus, cumulus, etc.). We previously reported that intelligent feature selection methods, combined with nearest neighbor classifiers, can dramatically improve classification accuracy on this task. Our subsequent analy...
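For reference, a minimal sketch of error-correcting output codes, the combining scheme named in the title: each class is assigned a binary codeword, one binary classifier is trained per codeword bit, and a test instance receives the class whose codeword is Hamming-nearest to the predicted bit string. The train_binary interface is hypothetical.

    import numpy as np

    def ecoc_train(X, y, code, train_binary):
        # One binary classifier per codeword bit; code is a
        # (n_classes, n_bits) 0/1 matrix and y holds class indices.
        return [train_binary(X, code[y, b]) for b in range(code.shape[1])]

    def ecoc_predict(x, classifiers, code):
        # Decode: pick the class whose codeword is Hamming-nearest
        # to the predicted bit string.
        bits = np.array([clf(x) for clf in classifiers])
        return int(np.argmin((code != bits).sum(axis=1)))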



Journal:

Volume:   Issue:

Pages:  -

Publication date: 1999